
Conversation

@chaserogo

Description

When running with per_device_batch_size < 1, the data loader yields batches larger than the training batch size; instead of splitting those larger batches into the smaller training batch sizes, the current code truncates them, discarding data. This PR splits each larger batch into smaller microbatches and loops over them, so no data is discarded.
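As a sketch of the intended splitting (the helper name explode_to_micro appears in this PR, but this standalone body and the dict-of-NumPy-arrays batch layout are assumptions for illustration, not the actual implementation):

import numpy as np

def explode_to_micro(global_batch, microbatch_size):
  """Split a dict-of-arrays batch along the leading axis into microbatches.

  Illustrative stand-in only: assumes every array shares the same leading
  batch dimension and that it divides evenly by microbatch_size.
  """
  batch_size = next(iter(global_batch.values())).shape[0]
  for start in range(0, batch_size, microbatch_size):
    yield {k: v[start:start + microbatch_size] for k, v in global_batch.items()}

# A batch of 4 examples becomes two microbatches of 2, so nothing is
# truncated away.
batch = {"inputs": np.arange(8).reshape(4, 2)}
micro = list(explode_to_micro(batch, microbatch_size=2))
assert len(micro) == 2 and micro[0]["inputs"].shape == (2, 2)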

Tests

Please describe how you tested this change, and include any instructions and/or
commands to reproduce.

Checklist

Before submitting this PR, please make sure (put X in square brackets):

  • I have performed a self-review of my code.
  • I have necessary comments in my code, particularly in hard-to-understand areas.
  • I have run end-to-end tests and provided workload links above if applicable.
  • I have made or will make corresponding changes to the doc if needed.

@google-cla
Copy link

google-cla bot commented Sep 17, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@aireenmei (Collaborator)

Thanks for contributing to maxtext! Sorry for the delay in review. Have you completed the CLA? Could you rebase the PR and complete the checklist so we can rerun the tests?

@aireenmei (Collaborator) left a comment


Could you add a unit test in multihost_dataloading_test.py for this feature?
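For what it's worth, a test of the no-truncation property could look roughly like the following; the test class is hypothetical and reuses the illustrative explode_to_micro sketch from the description above, whereas the real test would exercise the actual iterator in multihost_dataloading.py:

import unittest
import numpy as np

class MicrobatchSplitTest(unittest.TestCase):
  # Hypothetical test shape: checks that splitting keeps every example.

  def test_no_examples_discarded(self):
    batch = {"inputs": np.arange(12).reshape(6, 2)}
    micro = list(explode_to_micro(batch, microbatch_size=2))
    self.assertEqual(len(micro), 3)
    # Recombining the microbatches reproduces the full batch exactly.
    recombined = np.concatenate([m["inputs"] for m in micro])
    np.testing.assert_array_equal(recombined, batch["inputs"])

if __name__ == "__main__":
  unittest.main()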


sharded_iter = self._base_iter()
if self.microbatch_size_to_run:
  self.local_iterator = itertools.chain.from_iterable(
@aireenmei (Collaborator)

This can be simplified:

def microbatch_generator():
  for global_batch in sharded_iter:
    yield from self.explode_to_micro(global_batch)

if self.microbatch_size_to_run:
  self.local_iterator = microbatch_generator()
else:
  self.local_iterator = sharded_iter
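(Both forms are equally lazy; the generator version just makes the per-microbatch loop explicit.)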

